Day-27: C Language, Variable Scope, volatile, inline

2022 iThome Ironman Contest

DAY 28

Software Development

A First Close Encounter with Operating Systems: Exploring xv6, Part 28 of the series

14th Ironman Contest risc-v xv6 operating system c

吳他，惟手熟爾

團隊那團名要叫什麼？

2022-10-13 23:42:02

Introduction

While studying the xv6 source code, we come across various C keywords and their usages, such as static, extern, and volatile. This article organizes and explains them.

Variable scope: static, extern

In proc.c we see the following usage:

#include "types.h"
#include "param.h"
#include "memlayout.h"
#include "riscv.h"
#include "spinlock.h"
#include "proc.h"
#include "defs.h"

struct cpu cpus[NCPU];

struct proc proc[NPROC];

struct proc *initproc;

int nextpid = 1;
struct spinlock pid_lock;

extern void forkret(void);
static void freeproc(struct proc *p);

extern char trampoline[]; // trampoline.S
...

Two keywords stand out here: static and extern. They are explained below.

static

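In C, static appears in two places. The first is at file scope, as in static void freeproc(struct proc *p) above, where it is applied to a function (it can be applied to a global variable in the same way). The second is on a variable declared inside a function.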

static variables inside a function

A static variable declared inside a function is not destroyed when the function returns; its lifetime spans the whole program. For example:

#include <stdio.h>

void func(void)
{
    int a = 0;
    static int b = 0;

    printf("a = %d, b = %d\n", ++a, ++b);
}
int main(void) {
    for(int i = 0; i < 10; i++)
        func();
    return 0;
}

output:

a = 1, b = 1
a = 1, b = 2
a = 1, b = 3
a = 1, b = 4
a = 1, b = 5
a = 1, b = 6
a = 1, b = 7
a = 1, b = 8
a = 1, b = 9
a = 1, b = 10

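Each iteration of the for loop calls func(), which returns, and the next iteration calls func() again. In the output, a is 1 every time, while b changes with each call. The reason is that a is an automatic variable: it is created and re-initialized to 0 on every call and discarded when func() returns. b, in contrast, is not discarded when func() returns and stays in memory, so the next call to func() sees the value left by the previous call. This is why b increases with every call.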

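A static variable declared in a function keeps its value between calls, and that value stays in memory. Even though b is still in memory, main() cannot access it by name, because its scope is limited to func().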

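A static variable is very handy when we want to keep state across function calls without using a global variable (global variables easily pollute the namespace).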

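When initializing a static variable, note that in C the initializer must be a constant expression, that is, a value the compiler can determine at compile time. The 2 in int a = 2 is such a constant. A function call is not, so the following initialization of a static variable produces an error: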

#include <stdio.h>
int func(void)
{
    return 100;
}
int main(void)
{
    static int a = func();
    printf("%d\n", a);
}

output:

helloworld.c:8:20: error: initializer element is not constant
     static int a = func();
                    ^~~~

Functions qualified with static

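In proc.c we saw static void freeproc(struct proc *p), a function qualified with static. Here it means the function has internal linkage: it can only be referenced from the current file and cannot be called by name from other files. In effect it is a form of access control.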

extern

Before looking at extern, we first need to understand the difference between a declaration and a definition.

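Declaration: only tells the compiler that a variable or function exists somewhere in the program; no memory is allocated for it yet. The concept is very similar to a function prototype. If we write a function after main() and want to call it inside main(), we need a prototype before main(). The prototype tells us which arguments the function takes, in what order, and what type it returns. For example: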
```
#include <stdio.h>
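int func(int, int); // prototype: parameter types and return type only
int main(void)
{
    int a = 2, b = 3; // the original snippet used a and b without declaring them
    printf("a + b = %d\n", func(a, b));
}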

int func(int a, int b)
{
    return a + b;
}
```
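The example above shows the idea.
Definition: after declaring a function or variable, we still need to allocate memory for it, and that is what the definition does. The extern keyword marks a declaration of a variable or function that may be defined in, and shared with, other files: it tells the compiler to look for the definition outside the scope of the declaration. Suppose we write the following function prototype: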
```
int func(int, int);
```
The compiler implicitly treats it as the following:
```
extern int func(int, int);
```
meaning that the definition of this function is to be found elsewhere. We can run the following test.
The following is test.c:
```
#include <stdio.h>

int b = 10;
int func(int a, int b)
{
    return a + b;
}
```
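In hello.c we include test.c and use extern to tell the compiler that b and func() are defined elsewhere. Note that #include "test.c" pastes the contents of test.c into hello.c, so both end up in a single translation unit; the extern declarations work the same way when the two files are compiled separately (without the #include) and linked together.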
```
#include <stdio.h>
#include <stdlib.h>
#include "test.c"

extern int b;
extern int func(int, int);

int main(void)
{
    printf("%d\n", func(2,3));
    printf("%d", b);
}
```
output:
```
5
10
```

Comparing declaration and definition

The following is a valid declaration:

#include <stdio.h>
#include <stdlib.h>
extern int a;
int main(void)
{
    return 0;
}

The following, which uses an a that is declared but never defined, is invalid: no memory has been allocated for the declared variable, so there is nothing to assign to.

#include <stdio.h>
#include <stdlib.h>
extern int a;
int main(void)
{
    a = 10;
    return 0;
}

Interestingly, the error is not reported by the compiler but by the linker (ld is the linker):

C:\Users\Users\AppData\Local\Temp\ccWel9Fi.o:hello.c:(.text+0xd): undefined reference to `a'
collect2.exe: error: ld returned 1 exit status

The error says that a is undefined, which makes sense: we only declared a and never defined it. The compiler trusts that a definition of a exists in some other file (outside the scope), and it is the linker that discovers no definition can be found.

To fix this error, we only need a definition of a in another file. Below, test.c contains the single line int a;, and hello.c follows it:

int a;

#include <stdio.h>
#include <stdlib.h>
#include "test.c"

extern int a;
int main(void)
{
    a = 10;
    return 0;
}

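This compiles and runs successfully. What if we initialize the variable in the extern declaration itself? According to the C standard, a declaration with an initializer is a definition, so memory is allocated and the variable is defined. Let us compile and run it: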

#include <stdio.h>
#include <stdlib.h>

extern int a = 10;
int main(void)
{
    return 0;
}

It compiles and runs, but produces a warning:

hello.c:4:12: warning: 'a' initialized and declared 'extern'
 extern int a = 10;

Comparing static and extern

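extern means the variable or function has external linkage and can be referenced from other files, whereas static restricts visibility: to the current file for a file-scope declaration, or to the enclosing function for a local variable. In the following test, we place the definition of b in test.c: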

int b;

The following is hello.c:

#include <stdio.h>
#include <stdlib.h>
#include "test.c"

int main(void)
{
    extern int b;
    extern void func(void);
    for(int i = 0; i < 10; i++)
        func();
}

void func(void)
{
    static int a = 0;
    printf("%d %d\n", ++a, ++b);
}

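Here func() contains no declaration of b, yet it can access b. The reason is that #include "test.c" places int b; at file scope ahead of both functions, so b is visible throughout hello.c (this is the concept of scope). The extern int b; inside main() is a block-scope declaration that refers to this same b.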

Back to extern char trampoline[] and static void freeproc(struct proc *p) in proc.c: freeproc() can only be used within proc.c, which we can confirm by searching the whole xv6-riscv repository.

extern char trampoline[], on the other hand, is declared in proc.c and defined in trampoline.S. This also shows that something can be declared in many places, but defined only once.

volatile

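We touched on volatile briefly in an earlier article; here we explain it in detail and run some simple experiments. Accesses to a volatile variable are forced to go to its memory location. Without volatile, compiler optimizations may, for example, load the variable into a register once and then access it through the register; volatile forbids this. Consider the following code: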

Environment: x86-64 gcc 12.2 (using https://godbolt.org/)

#include <stdio.h>
int main(void)
{
    const int local = 10;
    int *ptr = (int *)&local;

    printf("Initial value of local : %d \n", local);
    *ptr = 100;
    printf("Modified value of local: %d \n", local);
    return 0;
}
//Example from geeksforgeeks

gcc -O0 hello.c -o hello
./hello

output:

Initial value of local : 10 
Modified value of local: 100

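Without optimization the result is as shown above: the compiler really does read the value from memory. If we add an optimization option, we expect the compiler to stop re-reading memory and use a cached value instead. (What happens here is a modification the compiler does not expect: a const variable is not supposed to be modified, and writing to it through a pointer is undefined behavior in C.)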

gcc -O3 hello.c -o hello
./hello

output:

Initial value of local : 10 
Modified value of local: 10

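After adding the optimization option (-O3), the compiler may read the value from a register or substitute the constant directly, which produces the output above. If we now add volatile to local, we force the compiler to read the value of local from memory every time: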

#include <stdio.h>
int main(void)
{
    volatile const int local = 10;
    int *ptr = (int *)&local;

    printf("Initial value of local : %d \n", local);
    *ptr = 100;
    printf("Modified value of local: %d \n", local);
    return 0;
}
//Example from geeksforgeeks

output:

Initial value of local : 10 
Modified value of local: 100

This avoids the effect of the optimization. Next, let us look at where xv6 uses volatile.
main.c

#include "types.h"
#include "param.h"
#include "memlayout.h"
#include "riscv.h"
#include "defs.h"

volatile static int started = 0;
...

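In the early article on xv6 startup and architecture we saw that started is qualified with volatile. When does a modification that the compiler does not expect occur? One case is a variable shared by several processes, where other processes modify it to store or share information. Here, started is shared by multiple harts (cores). Without volatile, the compiler would be free to keep started in a register, so a hart might act on a stale value; the initialization of the console (consoleinit()), interrupts, virtual memory pages and so on, which is coordinated through this flag, could then go wrong and xv6 would fail to boot correctly. This is why started, a variable shared by multiple harts (cores), must be qualified with volatile.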

inline

In xv6 we can see the following code fragment:

static inline void 
w_stvec(uint64 x)
{
  asm volatile("csrw stvec, %0" : : "r" (x));
}

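This function is qualified with inline. Before looking at inline, recall how an ordinary function call works: a stack is needed to save the return address, so that when the callee finishes we can return to the caller. Jumping back and forth like this takes some time. inline means expanding the function at the call site, much like a macro. If the function is expanded, the cost of the stack operations and the jumps is avoided. As an example, we see the following code in xv6: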

void
usertrap(void)
{
  int which_dev = 0;

  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");
  
  w_stvec((uint64)kernelvec);

After the function is expanded, it becomes the following:

void
usertrap(void)
{
  int which_dev = 0;

  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");
  
  uint64 x = (uint64)kernelvec;
  asm volatile("csrw stvec, %0" : : "r" (x));

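This saves some time. inline is a suggestion to the compiler that the function may be expanded to speed things up; the compiler decides for itself whether expansion or an ordinary function call is faster. If the body of an inline function is large, the time saved is small. If the body is small and the function is called from many places (w_stvec() above has exactly this property), the compiler will consider expanding it.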

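So for functions that are small and frequently called, we usually add inline in front and let the compiler decide on the optimization. Many of the functions in xv6 that operate on CSRs are qualified with inline.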